Universal Dependencies Parsing for Colloquial Singaporean English
Singlish can be interesting to the ACL community both linguistically as a
major creole based on English, and computationally for information extraction
and sentiment analysis of regional social media. We investigate dependency
parsing of Singlish by constructing a dependency treebank under the Universal
Dependencies scheme, and then training a neural network model by integrating
English syntactic knowledge into a state-of-the-art parser trained on the
Singlish treebank. Results show that English knowledge can lead to a 25% relative
error reduction, resulting in a parser with 84.47% accuracy. To the best of our
knowledge, we are the first to use neural stacking to improve cross-lingual
dependency parsing on low-resource languages. We make both our annotation and
parser available for further research.
Comment: Accepted by ACL 201
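As a rough check on the figures quoted above, the sketch below back-computes the baseline accuracy implied by a 25% relative error reduction that ends at 84.47%. The function name is illustrative, and the result assumes the 25% figure is exact rather than rounded.

```python
def implied_baseline_accuracy(final_accuracy: float,
                              relative_error_reduction: float) -> float:
    """Back out the baseline accuracy (in percent) from the final accuracy
    and the relative error reduction, both taken from the abstract."""
    final_error = 100.0 - final_accuracy
    # relative reduction r means: final_error = baseline_error * (1 - r)
    baseline_error = final_error / (1.0 - relative_error_reduction)
    return 100.0 - baseline_error


baseline = implied_baseline_accuracy(84.47, 0.25)
print(f"implied baseline accuracy: {baseline:.2f}%")  # prints 79.29%
```

Under that assumption the parser without English knowledge would sit near 79.3% accuracy, which is the scale of gain the abstract attributes to neural stacking.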
Impact of artificial intelligence adoption on online returns policies
The shift to e-commerce has led to an astonishing increase in online sales for retailers. However, the number of returns made on online purchases is also increasing and has a profound impact on retailers’ operations and profit. Hence, retailers need to strike a balance between minimizing and allowing product returns. This study examines an offline showroom versus an artificial intelligence (AI) online virtual-reality webroom and how the settings affect customers’ purchase and retailers’ return decisions. A case study is used to illustrate the AI application. Our results show that adopting artificial intelligence helps sellers to make better returns policies, maximize reselling returns, and reduce the risks of leftovers and shortages. Our findings unlock the potential of artificial intelligence applications in retail operations and should interest practitioners and researchers in online retailing, especially those concerned with online returns policies and the consumer personalized service experience.
Uncertainty Estimation by Fisher Information-based Evidential Deep Learning
Uncertainty estimation is a key factor that makes deep learning reliable in
practical applications. Recently proposed evidential neural networks explicitly
account for different uncertainties by treating the network's outputs as
evidence to parameterize the Dirichlet distribution, and achieve impressive
performance in uncertainty estimation. However, for samples with high data
uncertainty that are annotated with a one-hot label, the evidence-learning process
for those mislabeled classes is over-penalized and remains hindered. To address
this problem, we propose a novel method, Fisher Information-based Evidential
Deep Learning ($\mathcal{I}$-EDL). In particular, we introduce the Fisher
Information Matrix (FIM) to measure the informativeness of evidence carried by
each sample, according to which we can dynamically reweight the objective loss
terms to make the network more focused on the representation learning of
uncertain classes. The generalization ability of our network is further
improved by optimizing the PAC-Bayesian bound. As demonstrated empirically, our
proposed method consistently outperforms traditional EDL-related algorithms in
multiple uncertainty estimation tasks, especially in the more challenging
few-shot classification settings
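The evidence-to-Dirichlet step mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes the common evidential deep learning convention (alpha = evidence + 1, uncertainty = K / sum of alpha); it does not implement the Fisher-information reweighting or the PAC-Bayesian bound, and the function name is hypothetical.

```python
def dirichlet_from_evidence(evidence):
    """Map non-negative per-class evidence to Dirichlet parameters,
    expected class probabilities, and a vacuity-style uncertainty.

    Assumes the usual EDL convention alpha_k = evidence_k + 1.
    """
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)                   # Dirichlet strength S
    probs = [a / strength for a in alpha]   # expected class probabilities
    uncertainty = len(alpha) / strength     # K / S, equals 1 with no evidence
    return alpha, probs, uncertainty


# No evidence at all: uniform prediction, maximal uncertainty.
print(dirichlet_from_evidence([0.0, 0.0, 0.0]))
# Strong evidence for class 0: confident prediction, low uncertainty.
print(dirichlet_from_evidence([9.0, 0.0, 0.0]))
```

With nine units of evidence for one of three classes, the Dirichlet strength is 12, so the uncertainty mass drops from 1 to 0.25.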
Sim-T: Simplify the Transformer Network by Multiplexing Technique for Speech Recognition
In recent years, a great deal of attention has been paid to the Transformer
network for speech recognition tasks due to its excellent model performance.
However, the Transformer network always involves heavy computation and a large
number of parameters, causing serious deployment problems on devices with
limited computation resources or storage memory. In this paper, a new lightweight
model called Sim-T is proposed to expand the generality of the
Transformer model. With the help of the newly developed multiplexing
technique, Sim-T can efficiently compress the model with negligible
sacrifice in performance. More precisely, the proposed technique
has two parts: module weight multiplexing and attention score
multiplexing. Moreover, a novel decoder structure is proposed to
facilitate the attention score multiplexing. Extensive experiments have been
conducted to validate the effectiveness of Sim-T. On the Aishell-1 dataset, when
the proposed Sim-T has 48% fewer parameters than the baseline Transformer, a 0.4%
CER improvement is obtained. Alternatively, a 69% parameter reduction can be
achieved if Sim-T gives the same performance as the baseline Transformer.
On the HKUST and WSJ eval92 datasets, CER and WER improve
by 0.3% and 0.2%, respectively, when Sim-T has 40% fewer parameters than the
baseline Transformer.
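For readers unfamiliar with the metrics quoted above, the sketch below computes character error rate (CER) and word error rate (WER) using their standard definition: edit distance divided by reference length. It is an illustration of the metrics only, not the evaluation code used for Sim-T, and the function names are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance normalized by the reference length."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def cer(reference: str, hypothesis: str) -> float:
    return error_rate(list(reference), list(hypothesis))


def wer(reference: str, hypothesis: str) -> float:
    return error_rate(reference.split(), hypothesis.split())


print(cer("kitten", "sitting"))          # 3 edits over 6 characters
print(wer("the cat sat", "the cat sit"))  # 1 edit over 3 words
```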
The combination approach of SVM and ECOC for powerful identification and classification of transcription factor
Background: Transcription factors (TFs) are core functional proteins which play important roles in gene expression control, and they are key factors for gene regulation network construction. Traditionally, they were identified and classified through experimental approaches. In order to save time and reduce costs, many computational methods have been developed to identify TFs from new proteins and to classify the resulting TFs. Though these methods have facilitated screening of TFs to some extent, low accuracy is still a common problem. With the fast-growing number of new proteins, more precise algorithms for identifying TFs from new proteins and classifying the resulting TFs are in high demand.
Results: The support vector machine (SVM) algorithm was used to construct an automatic detector for TF identification, where protein domains and functional sites were employed as feature vectors. The error-correcting output coding (ECOC) algorithm, which originated in the information and communication engineering fields, was combined with the SVM methodology for TF classification. The overall success rates of identification and classification reached 88.22% and 97.83%, respectively. Finally, a web site was constructed to let users access our tools (see Availability and requirements section for URL).
Conclusion: The SVM method was a valid and stable means of TF identification with protein domains and functional sites as feature vectors. The ECOC algorithm is a powerful method for multi-class classification problems. When combined with the SVM method, it can remarkably increase the accuracy of TF classification using protein domains and functional sites as feature vectors. In addition, our work implies that the ECOC algorithm may succeed in a broad range of applications in biological data mining.
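The ECOC idea described above can be sketched as nearest-codeword decoding. In this minimal illustration each class gets a binary codeword, each bit would be predicted by one binary classifier (an SVM in the abstract's setting), and the predicted bit vector is mapped to the class with the closest codeword. The 4-class, 7-bit codebook and the integer class labels are assumptions for the example, not the coding matrix used in the paper.

```python
# Hypothetical codebook: 4 placeholder classes, 7 bits each.
# Every pair of codewords differs in 4 positions, so one wrong bit
# from a single binary classifier can still be corrected.
CODEBOOK = {
    0: (1, 1, 1, 1, 1, 1, 1),
    1: (0, 0, 0, 0, 1, 1, 1),
    2: (0, 0, 1, 1, 0, 0, 1),
    3: (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions where two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(bits, codebook=CODEBOOK):
    """Return the class whose codeword is nearest to the predicted bits."""
    return min(codebook, key=lambda label: hamming(bits, codebook[label]))


# Class 2's codeword with its last bit flipped, as if one classifier erred.
noisy = (0, 0, 1, 1, 0, 0, 0)
print(ecoc_decode(noisy))  # prints 2
```

The redundancy in the codewords is what lets the combined classifier tolerate mistakes from individual binary classifiers, which is the property the abstract credits for the improved multi-class accuracy.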
- …